Cascaded Text Generation with Markov Transformers
The two dominant approaches to neural text generation are fully autoregressive models, using serial beam search decoding, and non-autoregressive models, using parallel decoding with no output dependencies. This work proposes an autoregressive model with sub-linear parallel time generation. Noting that conditional random fields with bounded context can be decoded in parallel, we propose an efficient cascaded decoding approach for generating high-quality output. To parameterize this cascade, we introduce a Markov transformer, a variant of the popular fully autoregressive model that allows us to simultaneously decode with specific autoregressive context cutoffs. This approach requires only a small modification from standard autoregressive training, while showing competitive accuracy/speed tradeoff compared to existing methods on five machine translation datasets.
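The key observation in the abstract is that a conditional random field with bounded (Markov) context can be decoded in sub-linear parallel time, because its max-plus "matrix products" are associative and can be combined in a balanced tree. As a rough illustration only (not the authors' implementation), here is a minimal NumPy sketch for a first-order chain, where `transitions[t][i, j]` is an assumed per-position score for moving from label `i` to label `j`:

```python
import numpy as np

def maxplus(A, B):
    # Max-plus "matrix product": result[i, j] = max_k (A[i, k] + B[k, j]).
    # This is the Viterbi analogue of ordinary matrix multiplication.
    return (A[:, :, None] + B[None, :, :]).max(axis=1)

def viterbi_score_parallel(transitions):
    # transitions: list of T score matrices, one per chain position.
    # Max-plus products are associative, so they can be combined in a
    # balanced tree: O(log T) parallel depth instead of an O(T) scan.
    mats = list(transitions)
    while len(mats) > 1:
        paired = [maxplus(mats[i], mats[i + 1])
                  for i in range(0, len(mats) - 1, 2)]
        if len(mats) % 2 == 1:
            paired.append(mats[-1])  # carry the odd matrix forward
        mats = paired
    return mats[0].max()  # score of the best label sequence
```

Each tree level halves the number of matrices, so with enough parallel workers the wall-clock depth grows logarithmically in sequence length, matching the sub-linear parallel-time claim.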
Review for NeurIPS paper: Cascaded Text Generation with Markov Transformers
Weaknesses: While I am advocating for this paper's acceptance, I am curious whether the authors think this will truly become the dominant approach in this area. I find this approach theoretically more appealing than the Levenshtein transformer, but I don't think the "global communication" of that model is strictly a negative feature. The more local nature of this model certainly yields a speedup, but successfully capturing long-range dependencies is one of the things transformer models like GPT-3 seem to be good at. This points to a limitation of evaluating only on MT: in MT, the input heavily constrains the shape of the output, so long-range output dependencies may not be as necessary.
Review for NeurIPS paper: Cascaded Text Generation with Markov Transformers
This paper proposes a semi-autoregressive neural text generation method with a cascaded transformer, in which candidate outputs are pruned during decoding by scoring with CRF models of increasing context length. The method supports parallel computation at inference, yielding roughly a 7x speedup over autoregressive decoding with some quality loss, which is still better than most existing non-autoregressive methods. Reviewers all agree that the idea is novel and well developed. The experiments are extensive and show significant benefits over existing autoregressive and non-autoregressive methods.
Revisiting the Markov Property for Machine Translation
Du, Cunxiao, Zhou, Hao, Tu, Zhaopeng, Jiang, Jing
In this paper, we re-examine the Markov property in the context of neural machine translation. We design a Markov Autoregressive Transformer~(MAT) and undertake a comprehensive assessment of its performance across four WMT benchmarks. Our findings indicate that MAT with an order larger than 4 can generate translations with quality on par with that of conventional autoregressive transformers. In addition, counter-intuitively, we also find that the advantages of utilizing a higher-order MAT do not specifically contribute to the translation of longer sentences.
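A Markov autoregressive transformer of order k restricts each position to condition on at most the k preceding output tokens. One common way to realize such a bounded context in a transformer decoder is a banded causal attention mask; the sketch below is an illustrative assumption about how MAT's context restriction might be implemented, not a description of the paper's actual code:

```python
import numpy as np

def markov_causal_mask(seq_len, order):
    # Boolean mask: entry [t, s] is True if query position t may attend
    # to position s. Standard causal masking (s <= t) is combined with a
    # Markov window (s >= t - order), giving bounded left context.
    i = np.arange(seq_len)[:, None]  # query positions
    j = np.arange(seq_len)[None, :]  # key positions
    return (j <= i) & (j >= i - order)
```

With `order` equal to the sequence length this reduces to the ordinary causal mask; smaller orders trade context for the decodability properties discussed above, consistent with the finding that order 4 or more already matches full autoregressive quality.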
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)
Cascaded Text Generation with Markov Transformers
Deng, Yuntian, Rush, Alexander M.